home *** CD-ROM | disk | FTP | other *** search
-
-
-
-
-
-
- Network Working Group M. Sirbu
- Request for Comments: 1049 CMU
- March 1988
-
- A CONTENT-TYPE HEADER FIELD FOR INTERNET MESSAGES
-
- STATUS OF THIS MEMO
-
- This RFC suggests proposed additions to the Internet Mail Protocol,
- RFC-822, for the Internet community, and requests discussion and
- suggestions for improvements. Distribution of this memo is
- unlimited.
-
- ABSTRACT
-
- A standardized Content-type field allows mail reading systems to
- automatically identify the type of a structured message body and to
- process it for display accordingly. The structured message body must
- still conform to the RFC-822 requirements concerning allowable
- characters. A mail reading system need not take any specific action
- upon receiving a message with a valid Content-Type header field. The
- ability to recognize this field and invoke the appropriate display
- process accordingly will, however, improve the readability of
- messages, and allow the exchange of messages containing mathematical
- symbols, or foreign language characters.
-
- Table of Contents
-
- 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 1
- 2. Problems with Structured Messages . . . . . . . . . . . . . . . 3
- 3. The Content-type Header Field . . . . . . . . . . . . . . . . . 5
- 3.1. Type Values . . . . . . . . . . . . . . . . . . . . . . 5
- 3.2. Version Number . . . . . . . . . . . . . . . . . . . . . 6
- 3.3. Resource Reference . . . . . . . . . . . . . . . . . . . 6
- 3.4. Comment. . . . . . . . . . . . . . . . . . . . . . . . . 7
- 4. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 7
-
- 1. Introduction
-
- As defined in RFC-822, [2], an electronic mail message consists of a
- number of defined header fields, some containing structured
- information (e.g., date, addresses), and a message body consisting of
- an unstructured string of ASCII characters.
-
- The success of the Internet mail system has led to a desire to use
- the mail system for sending around information with a greater degree
- of structure, while remaining within the constraints imposed by the
- limited character set. A prime example is the use of mail to send a
-
-
-
- Sirbu [Page 1]
-
- RFC 1049 Mail Content Type March 1988
-
-
- document with embedded TROFF formatting commands. A more
- sophisticated example would be a message body encoded in a Page
- Description Language (PDL) such as Postscript. In both cases, simply
- mapping the ASCII characters to the screen or printer in the usual
- fashion will not render the document image intended by the sender; an
- additional processing step is required to produce an image of the
- message text on a display device or a piece of paper.
-
- In both of these examples, the message body contains only the legal
- character set, but the content has a structure which produces some
- desirable result after appropriate processing by the recipient. If a
- message header field could be used to indicate the structuring
- technique used in the message body, then a sophisticated mail system
- could use such a field to automatically invoke the appropriate
- processing of the message body. For example, a header field which
- indicated that the message body was encoded using Postscript could be
- used to direct a mail system running under Sun Microsystem's NEWS
- window manager to process the Postscript to produce the appropriate
- page image on the screen.
-
- Private header fields (beginning with "X-") are already being used by
- some systems to affect such a result (e.g., the Andrew Message System
- developed at Carnegie Mellon University). However, the widespread
- use of such techniques will require general agreement on the name and
- allowed parameter values for a header field to be used for this
- purpose.
-
- We propose that a new header field, "Content-type:" be recognized as
- the standard field for indicating the structure of the message body.
- The contents of the "Content-Type:" field are parameters which
- specify what type of structure is used in the message body.
-
- Note that we are not proposing that the message body contain anything
- other than ASCII characters as specified in RFC-822. Whatever
- structuring is contained in the message body must be represented
- using only the allowed ASCII characters. Thus, this proposal should
- have no impact on existing mailers, only on mail reading systems.
-
- At the same time, this restriction eliminates the use of more general
- structuring techniques such as Abstract Syntax Notation, (CCITT
- Recommendation X.409) as used in the X.400 messaging standard, which
- are octet-oriented.
-
- This is not the first proposal for structuring message bodies.
- RFC-767 discusses a proposed technique for structuring multi-media
- mail messages. We are also aware that many users already employ mail
- to send TROFF, SCRIBE, TEX, Postscript or other structured
- information. Such postprocessing as is required must be invoked
-
-
-
- Sirbu [Page 2]
-
- RFC 1049 Mail Content Type March 1988
-
-
- manually by the message recipient who looks at the message text
- displayed as conventional ASCII and recognizes that it is structured
- in some way that requires additional processing to be properly
- rendered. Our proposal is designed to facilitate automatic
- processing of messages by a mail reading system.
-
- 2. Problems with Structured Messages
-
- Once we introduce the notion that a message body might require some
- processing other than simply painting the characters to the screen we
- raise a number of fundamental questions. These generally arise due
- to the certainty that some receiving systems will have the facilities
- to process the received message and some will not. The problem is
- what to do in the presence of systems with different levels of
- capability.
-
- First, we must recognize that the purpose of structured messages is
- to be able to send types of information, ultimately intended for
- human consumption, not expressable in plain ASCII. Thus, there is no
- way in plain ASCII to send the italics, boldface, or greek characters
- that can be expressed in Postscript. If some different processing is
- necessary to render these glyphs, then that is the minimum price to
- be paid in order to send them at all.
-
- Second, by insisting that the message body contain only ASCII, we
- insure that it will not "break" current mail reading systems which
- are not equipped to process the structure; the result on the screen
- may not be readily interpretable by the human reader, however.
-
- If a message sender knows that the recipient cannot process
- Postscript, he or she may prefer that the message be revised to
- eliminate the use of italics and boldface, rather than appear
- incomprehensible. If Postscript is being used because the message
- contains passages in Greek, there may be no suitable ASCII
- equivalent, however.
-
- Ideally, the details of structuring the message (or not) to conform
- to the capabilities of the recipient system could be completely
- hidden from the message sender. The distributed Internet mail system
- would somehow determine the capabilities of the recipient system, and
- convert the message automatically; or, if there was no way to send
- Greek text in ASCII, inform the sender that his message could not be
- transmitted.
-
-
-
-
-
-
-
-
- Sirbu [Page 3]
-
- RFC 1049 Mail Content Type March 1988
-
-
- In practice, this is a difficult task. There are three possible
- approaches:
-
- 1. Each mail system maintains a database of capabilities of
- remote systems it knows how to send to. Such a database
- would be very difficult to keep up to date.
-
- 2. The mail transport service negotiates with the receiving
- system as to its capabilities. If the receiving system
- cannot support the specified content type, the mail is
- transformed into conventional ASCII before transmission.
- This would require changes to all existing SMTP
- implementations, and could not be implemented in the case
- where RFC-822 type messages are being forwarded via Bitnet or
- other networks which do not implement SMTP.
-
- 3. An expanded directory service maintains information on mail
- processing capabilities of receiving hosts. This eliminates
- the need for real-time negotiation with the final
- destination, but still requires direct interaction with the
- directory service. Since directory querying is part of mail
- sending as opposed to mail composing/reading systems, this
- requires changes to existing mailers as well as a major
- change to the domain name directory service.
-
- We note in passing that the X.400 protocol implements approach number
- 2, and that the Draft Recommendations for X.DS, the Directory
- Service, would support option 3.
-
- In the interest of facilitating early usage of structured messages,
- we choose not to recommend any of the three approaches described
- above at the present time. In a forthcoming RFC we will propose a
- solution based on option 2, requiring modification to mailers to
- support negotiation over capabilities. For the present, then, users
- would be obliged to keep their own private list of capabilities of
- recipients and to take care that they do not send Postscript, TROFF
- or other structured messages to recipients who cannot process them.
- The penalty for failure to do so will be the frustration of the
- recipient in trying to read a raw Postscript or TROFF file painted on
- his or her screen. Some System Administrators may attempt to
- implement option 1 for the benefit of their users, but this does not
- impose a requirement for changes on any other mail system.
-
- We recognize that the long-term solution must require changes to
- mailers. However, in order to begin now to standardize the header
- fields, and to facilitate experimentation, we issue the present RFC.
-
-
-
-
-
- Sirbu [Page 4]
-
- RFC 1049 Mail Content Type March 1988
-
-
- 3. The Content-type Header Field
-
- Whatever structuring technique is specified by the Content-type
- field, it must be known precisely to both the sender and the
- recipient of the message in order for the message to be properly
- interpreted. In general, this means that the allowed parameter
- values for the Content-type: field must identify a well-defined,
- standardized, document structuring technique. We do not preclude,
- however, the use of a Content-type: parameter value to specify a
- private structuring technique known only to the sender and the
- recipient.
-
- More precisely, we propose that the Content-type: header field
- consist of up to four parameter values. The first, or type parameter
- names the structuring technique; the second, optional, parameter is a
- version number, ver-num, which indicates a particular version or
- revision of the standardized structuring technique. The third
- parameter is a resource reference, resource-ref, which may indicate a
- standard database of information to be used in interpreting the
- structured document. The last parameter is a comment.
-
- In the Extended Backus Naur Form of RFC-822, we have:
-
- Content-Type:= type [";" ver-num [";" 1#resource-ref]] [comment]
-
- 3.1. Type Values
-
- Initially, the type parameter would be limited to the following set
- of values:
-
- type:= "POSTSCRIPT"/"SCRIBE"/"SGML"/"TEX"/"TROFF"/
- "DVI"/"X-"atom
-
- These values are not case sensitive. POSTSCRIPT, Postscript, and
- POStscriPT are all equivalent.
-
- POSTSCRIPT Indicates the enclosed document consists of
- information encoded using the Postscript Page
- Definition Language developed by Adobe Systems,
- Inc. [1]
-
- SCRIBE Indicates the document contains embedded formatting
- information according to the syntax used by the
- Scribe document formatting language distributed by
- the Unilogic Corporation. [6]
-
- SGML Indicates the document contains structuring
- information to according the rules specified for
-
-
-
- Sirbu [Page 5]
-
- RFC 1049 Mail Content Type March 1988
-
-
- the Standard Generalized Markup Language, IS 8879,
- as published by the International Organization for
- Standardization. [3] Documents structured according
- to the ISO DIS 8613--Office Docment Architecture and
- Interchange Format--may also be encoded using SGML
- syntax.
-
- TEX Indicates the document contains embedded formatting
- information according to the syntax of the TEX
- document production language. [4]
-
- TROFF Indicates the document contains embedded formatting
- information according to the syntax specified for the
- TROFF formatting package developed by AT&T Bell
- Laboratories. [5]
-
- DVI Indicates the document contains information according
- to the device independent file format produced by
- TROFF or TEX.
-
- "X-"atom Any type value beginning with the characters "X-" is
- a private value.
-
- 3.2. Version Number
-
- Since standard structuring techniques in fact evolve over time, we
- leave room for specifying a version number for the content type.
- Valid values will depend upon the type parameter.
-
- ver-num:= local-part
-
- In particular, we have the following valid values:
-
- For type=POSTSCRIPT
-
- ver-num:= "1.0"/"2.0"/"null"
-
- For type=SCRIBE
-
- ver-num:= "3"/"4"/"5"/"null"
-
- For type=SGML
-
- ver-num:="IS.8879.1986"/"null"
-
- 3.3. Resource Reference
-
- resource-ref:= local-part
-
-
-
- Sirbu [Page 6]
-
- RFC 1049 Mail Content Type March 1988
-
-
- As Apple has demonstrated with their implementation of the
- Laserwriter, a very general document structuring technique can be
- made more efficient by defining a set of macros or other similar
- resources to be used in interpreting any transmitted stream. The
- Macintosh transmits a LaserPrep file to the Laserwriter containing
- font and macro definitions which can be called upon by subsequent
- documents. The result is that documents as sent to the Laserwriter
- are considerably more compact than if they had to include the
- LaserPrep file each time. The Resource Reference parameter allows
- specification of a well known resource, such as a LaserPrep file,
- which should be used by the receiving system when processing the
- message.
-
- Resource references could also include macro packages for use with
- TEX or references to preprocessors such as eqn and tbl for use with
- troff. Allowed values will vary according to the type parameter.
-
- In particular, we propose the following values:
-
- For type = POSTSCRIPT
-
- resource-ref:= "laserprep2.9"/"laserprep3.0"/"laserprep3.1"/
- "laserprep4.0"/local-part
-
- For type = TROFF
-
- resource-ref:= "eqn"/"tbl"/"me"/local-part
-
- 3.4. Comment
-
- The comment field can be any additional comment text the user
- desires. Comments are enclosed in parentheses as specified in
- RFC-822.
-
- 4. Conclusion
-
- A standardized Content-type field allows mail reading systems to
- automatically identify the type of a structured message body and to
- process it for display accordingly. The strcutured message body must
- still conform to the RFC-822 requirements concerning allowable
- characters. A mail reading system need not take any specific action
- upon receiving a message with valid Content-Type header field. The
- ability to recognize this field and invoke the appropriate display
- process accordingly will, however, improve the readability of
- messages, and allow the exchange of messages containing mathematical
- symbols, or foreign language characters.
-
-
-
-
-
- Sirbu [Page 7]
-
- RFC 1049 Mail Content Type March 1988
-
-
- In the near term, the major use of a Content-Type: header field is
- likely to be for designating the message body as containing a Page
- Definition Language representation such as Postscript.
-
- Additional type values shall be registered with Internet Assigned
- Numbers Coordinator at USC-ISI. Please contact:
-
- Joyce K. Reynolds
- USC Information Sciences Institute
- 4676 Admiralty Way
- Marina del Rey, CA 90292-6695
-
- 213-822-1511 JKReynolds@ISI.EDU
-
- REFERENCES
-
- 1. Adobe Systems, Inc. Postscript Language Reference Manual.
- Addison-Wesley, Reading, Mass., 1985.
-
- 2. Crocker, David H. RFC-822: Standard for the Format of ARPA
- Internet Text Messages. Network Information Center,
- August 13, 1982.
-
- 3. ISO TC97/SC18. Standard Generalized Markup Language.
- Tech. Rept. DIS 8879, ISO, 1986.
-
- 4. Knuth, Donald E. The TEXbook. Addison-Wesley, Reading, Mass.,
- 1984.
-
- 5. Ossanna, Joseph F. NROFF/TROFF User's Manual. Bell
- Laboratories, Murray Hill, New Jersey, 1976. Computing Science
- Technical Report No.54.
-
- 6. Unilogic. SCRIBE Document Production Software. Unilogic, 1985.
- Fourth Edition.
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
- Sirbu [Page 8]
-